59 research outputs found

    Text Mining with HathiTrust: Empowering Librarians to Support Digital Scholarship Research

    This workshop will introduce attendees to text analysis research and the common methods and tools used in this emerging area of scholarship, with particular attention to the HathiTrust Research Center. The workshop's train-the-trainer curriculum will provide a framework for how librarians can support text data mining, as well as teach transferable skills useful for many other areas of digital scholarly inquiry. Topics include: introduction to gathering, managing, analyzing, and visualizing textual data; hands-on experience with text analysis tools, including the HTRC's off-the-shelf algorithms and datasets, such as the HTRC Extracted Features; and using the command line to run basic text analysis processes. No experience necessary! Attendees must bring a laptop.

    Unconventional import pathways of the dually localised proteins Thioredoxin 1 and Glutathione peroxidase 3 into the mitochondrial intermembrane space

    Mitochondria are essential organelles, underpinning a variety of vital cellular functions. However, mitochondrial DNA (mtDNA) encodes only 13 proteins, and therefore the remaining ~1000 proteins in the proteome of Saccharomyces cerevisiae require import into the organelle. All the proteins of the intermembrane space (IMS), a sub-compartment of mitochondria, require import. The main import pathway to the IMS is the mitochondrial import and assembly (MIA) pathway. As well as being the interface between the matrix and the cytosol, the IMS is the site of disulfide bond formation through oxidative folding by the MIA pathway. However, unlike other locations where disulfide bond formation occurs, i.e. the endoplasmic reticulum (ER) and the bacterial periplasm, no reductive pathway has yet been detailed for the IMS. Because the MIA pathway, the major IMS import pathway, is subject to redox regulation, it is of interest to investigate proteins that may have a role in balancing the redox environment of the IMS. Although the oxidase Glutathione peroxidase 3 (Gpx3) and the reductase Thioredoxin 1 (Trx1) have previously been identified as dually localised between the IMS and the cytosol, their import pathways remain unknown. The first part of this thesis focused on investigating the import pathway of Trx1; however, as the project developed, the aims shifted to optimising the import protocol of Trx1. Whilst the import could not be optimised, parameters were identified that can be excluded from having a major role in this import pathway. The second part of this thesis focused on the import pathway of Gpx3 and components that may be involved. The roles of the abundant outer membrane proteins Om14 and Om45 were specifically investigated in both a post- and co-translational manner with the use of ribosome-stalled Gpx3 RNA. Whilst no effect was observed on the post-translational import of Gpx3 into mitochondria lacking Om45, an increase in import was observed in a co-translational system. This preliminary result suggests that Om45 may play a role in the import of Gpx3 and warrants further investigation. Furthering understanding of the nuances involved in the various protein import pathways into mitochondria will ultimately aid the development of therapeutics for mitochondrial diseases.

    Digging, Reaching, and Learning: An Update on the First Year of the HathiTrust Research Center's Librarian Training Program

    The HathiTrust Research Center is leading an IMLS-funded multi-institutional project that seeks to arm librarians with practical skills in text mining methods and knowledge of trends in digital scholarship. This poster will offer an overview of the instructional design approaches used in developing the curriculum, and an analysis of the assessment and revision process.

    Text Data Mining Beyond the Open Data Paradigm: Perspectives at the Intersection of Intellectual Property and Ethics

    This poster highlights outcomes from an IMLS-funded National Forum project on text data mining with content that is subject to use conditions due to intellectual property rights. It argues that developing strong frameworks for conducting text mining with IP-limited data is an urgent priority for supporting responsible, sustainable research in the twenty-first century. Institute of Museum and Library Services (LG-73-17-0070-17)

    Scholarly Needs for Text Analysis Resources: A User Assessment Study for the HathiTrust Research Center

    The HathiTrust Research Center (HTRC) is undertaking a study to better understand the needs of current and potential users of the center’s tools and services for computational text analysis. In this paper, we report on the results of the first phase of the study, which consisted of interviews with scholars, administrators, and librarians whose work involves text data mining. Our study reveals that text analysis workflows are specific to the individual research project and are often nonlinear. In spite of, and in some cases because of, the wealth of textual data available, scholars find it most difficult to locate, access, and curate textual data for their research. While the goals of the study directly relate to research and development for the HTRC, our results are useful for other large-scale data providers developing solutions for allowing computational access to their content.

    Building a Bridge to Next Generation DH Services in Libraries with a Campus Needs Assessment

    This poster reports on a needs assessment for digital humanities library services undertaken at a large research university in order to provide a basis for transition to a next phase of Digital Humanities (DH) support at a library supporting a growing amount of DH work on campus. It reports key findings and how the library services will evolve to meet needs identified on campus. The full report on which this presentation is based is available at http://hdl.handle.net/2142/100081

    Data Mining Research with In-copyright and Use-limited Text Datasets: Preliminary Findings from a Systematic Literature Review and Stakeholder Interviews

    Text data mining and analysis has emerged as a viable research method for scholars, following the growth of mass digitization, digital publishing, and scholarly interest in data re-use. Yet the texts that comprise datasets for analysis are frequently protected by copyright or other intellectual property rights that limit their access and use. This paper discusses the role of libraries at the intersection of data mining and intellectual property, asserting that academic libraries are vital partners in enabling scholars to effectively incorporate text data mining into their research. We report on activities leading up to an IMLS-funded National Forum of stakeholders and discuss preliminary findings from a systematic literature review, as well as initial results of interviews with forum stakeholders. Emerging themes suggest the need for a multi-pronged distributed approach that includes a public campaign for building awareness and advocacy, development of best practice guides for library support services and training, and international efforts toward data standardization and copyright harmonization. Institute of Museum and Library Services (LG-73-17-0070-17)

    HathiTrust Research Center User Requirements Study White Paper

    This paper presents findings from an investigation into trends and practices in humanities and social sciences research that incorporates text data mining. As affiliates of the HathiTrust Research Center (HTRC), we undertook the study to illuminate researcher needs and expectations for text data, tools, and training for text mining in order to better understand our current and potential user community. Results of our study have informed and will continue to inform development of HTRC tools and services for computational text analysis.

    Scholarly Commons Digital Humanities Needs Assessment Study

    The members of the Digital Humanities Needs Assessment Working Group completed an analysis of current activities and future needs for digital humanities and digital scholarship-oriented research and teaching at the University of Illinois at Urbana-Champaign. This study originated as an investigation into the particular practices and work of digital humanities researchers, and how the University Library could support the needs for digital humanities research, particularly via the resources and expertise provided in the Scholarly Commons. This report delivers findings gathered via interviews and a follow-up survey, and analyzed by the Working Group. It identifies thematic Areas of Need and proposes Recommendations for the Library.

    From karyotypes to precision genomics in 9p deletion and duplication syndromes

    While 9p deletion and duplication syndromes have been studied for several years, small sample sizes and minimal high-resolution data have limited a comprehensive delineation of genotypic and phenotypic characteristics. In this study, we examined genetic data from 719 individuals in the worldwide 9p Network Cohort: a cohort seven to nine times larger than any previous study of 9p. Most breakpoints occur in bands 9p22 and 9p24, accounting for 35% and 38% of all breakpoints, respectively. Bands 9p11 and 9p12 have the fewest breakpoints, with each accounting for 0.6% of all breakpoints. The most common phenotype in 9p deletion and duplication syndromes is developmental delay, and we identified eight known neurodevelopmental disorder genes in 9p22 and 9p24. Since it has been previously reported that some individuals have a secondary structural variant related to the 9p variant, we examined our cohort for these variants and found 97 events. The top secondary variant involved 9q in 14 individuals (1.9%), including ring chromosomes and inversions. We identified a gender bias with significant enrichment for females (p = 0.0006) that may arise from a sex reversal in some individuals with 9p deletions. Genes on 9p were characterized regarding function, constraint metrics, and protein-protein interactions, resulting in a prioritized set of genes for further study. Finally, we achieved precision genomics in one child with a complex 9p structural variation using modern genomic technologies, demonstrating that long-read sequencing will be integral for some cases. Our study is the largest ever on 9p-related syndromes and provides key insights into genetic factors involved in these syndromes.